The dialogue breakdown detection challenge: Task description, datasets, and evaluation metrics
Authors
Abstract
Dialogue breakdown detection is a promising technique in dialogue systems. To promote the research and development of such a technique, we organized a dialogue breakdown detection challenge where the task is to detect a system’s inappropriate utterances that lead to dialogue breakdowns in chat. This paper describes the design, datasets, and evaluation metrics for the challenge as well as the methods and results of the submitted runs of the participants.
Similar resources
On Dialogue Breakdown: Annotation and Detection A report from the dialogue breakdown detection challenge
We organized the dialogue breakdown challenge, an evaluation workshop dedicated to the detection of breakdowns in Japanese human-machine chat-oriented dialogues. This paper reports analyses of the breakdown annotations on the evaluation data used in the challenge and of the behavior of the detection systems submitted by six teams. The performances of the ensembles of the systems are a...
Cross-validating Image Description Datasets and Evaluation Metrics
The task of automatically generating sentential descriptions of image content has become increasingly popular in recent years, resulting in the development of large-scale image description datasets and the proposal of various metrics for evaluating image description generation systems. However, not much work has been done to analyse and understand both datasets and the metrics. In this paper, w...
Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation
Automated metrics such as BLEU are widely used in the machine translation literature. They have also been used recently in the dialogue community for evaluating dialogue response generation. However, previous work in dialogue response generation has shown that these metrics do not correlate strongly with human judgment in the non task-oriented dialogue setting. Task-oriented dialogue responses ...
Utterance Selection Based on Sentence Similarities and Dialogue Breakdown Detection on NTCIR-12 STC Task
This paper describes our contribution to the NTCIR-12 STC Japanese task. The purpose of the task is to retrieve tweets that are suitable as responses for a chat-oriented dialogue system from a huge pool of tweets. Our system retrieves tweets in two steps: first it retrieves tweets that resemble the input sentences, and then it filters out inappropriate tweets in terms of the dialogu...
Extractive meeting summarization through speaker zone detection
In this paper we investigate the role of discourse analysis in the extractive meeting summarization task. Specifically, our proposed method comprises two distinct steps. First, we use a meeting segmentation algorithm to detect the various functional parts of the input meeting. Afterwards, a two-level scoring mechanism in a graph-based framework is used to score each dialogue act in order to e...